# Timeseries of metaphor use reveal differing reactivities on cable news
## Supplemental Information (SI)
In [1]:
%matplotlib inline
import sys
sys.path.append('../')
# needed to access flask-mongoengine models
import os
# need to create one if you haven't already
os.environ['CONFIG_FILE'] = 'conf/default.cfg'
# will use this .gitignore'd directory to output figures
SAVE_DIR = 'Figures'
if not os.path.isdir(SAVE_DIR):
    os.mkdir(SAVE_DIR)
In [12]:
from viomet_9_10_17 import subject_object_analysis
from projects.common import get_project_data_frame
I generate two files, currently named `viomet-snapshot-project-df.csv` and `viomet-2012-snapshot-project-df.csv`, which are the September-to-November dataframes for 2016 and 2012, respectively. They contain all rows that have been identified as metaphor, and were built using the following commands in Python:
from projects.common import get_project_data_frame
df = get_project_data_frame('Viomet Sep-Nov 2016')
df.to_csv('/Users/mt/Desktop/viomet-snapshot-project-df.csv',
          header=True, index=False, na_rep=None)
I then uploaded the resulting file to the Metacorps server using `scp`.
For completeness I will soon upload the full dataset in .csv form, which will include the potential instances that were either not metaphor or not about politics. This and the other .csv will be made available on a data publishing portal and mirrored on the Metacorps server.
In [28]:
metaphors_url = 'http://metacorps.io/static/viomet-snapshot-project-df.csv'
project_df = get_project_data_frame(metaphors_url)
print(project_df.columns)
Given the project dataframe, the desired date range, and the corresponding `IatvCorpus` name (TODO: add downloadable data to read from, as with `project_df`), the excited-state frequency change model can be calculated for every cable news source, and for the frequency of the sources taken as a whole.
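The model, as described here, can be read as a piecewise-constant daily frequency: a ground-state rate outside a fitted excited period and a distinct (typically higher) rate inside it. Below is a minimal sketch under that reading; the function and parameter names are illustrative assumptions, not the actual implementation in `viomet_9_10_17`.
In [ ]:
import numpy as np

def step_frequency(dates, excited_start, excited_end, f_ground, f_excited):
    # Piecewise-constant daily frequency: f_excited inside the fitted
    # excited period, f_ground outside it. Illustrative assumption only.
    dates = np.asarray(dates)
    in_excited = (dates >= excited_start) & (dates <= excited_end)
    return np.where(in_excited, f_excited, f_ground)

# e.g., with hypothetical partition dates:
# step_frequency(pd.date_range('2016-9-1', '2016-11-30', freq='D'),
#                np.datetime64('2016-10-01'), np.datetime64('2016-10-31'),
#                f_ground=2.0, f_excited=5.0)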
By inspecting `fit_all_networks`, we can dig deeper into how the model fitting works; we'll return to this. For now, notice that `fit_networks` is a dictionary with three keys, one for each of the cable networks we are studying: `'MSNBCW'`, `'CNNW'`, and `'FOXNEWSW'`. The `W` stands for "west": the western feeds of these channels were the versions recorded in San Francisco, which can be confirmed by examining the TVNA metadata blobs for each show.
The resulting data, printed to the console below, is presented in tables at the beginning of the Results section.
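As a quick orientation once the next cell has run, the fits can be inspected directly. The only structural assumption made here is the one used later in this notebook: each network's entry is indexable, with the fitted partition information at index 0.
In [ ]:
# Sketch: print the partition info fitted for each network. Assumes
# only that fit_networks[network][0] holds the partition information,
# matching the access pattern used for partition_infos below.
for network in ('MSNBCW', 'CNNW', 'FOXNEWSW'):
    print(network, fit_networks[network][0])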
In [15]:
from viomet_9_10_17 import fit_all_networks
import pandas as pd
date_range = pd.date_range('2016-9-1', '2016-11-30', freq='D')
# run the model fits; this takes tens of seconds at least
fit_networks = fit_all_networks(project_df, date_range=date_range, iatv_corpus_name='Viomet Sep-Nov 2016')
In [16]:
print(fit_networks)
In [17]:
# set by_network=False to get the fit for all networks taken together
fit_sum = fit_all_networks(project_df, by_network=False, date_range=date_range, iatv_corpus_name='Viomet Sep-Nov 2016')
In [18]:
print(fit_sum)
In [18]:
from viomet_9_10_17 import by_network_frequency_figure
partition_infos = {network: fit_networks[network][0]
                   for network in ['MSNBCW', 'CNNW', 'FOXNEWSW']}
by_network_frequency_figure(
project_df, date_range=date_range,
iatv_corpus_name='Viomet Sep-Nov 2016',
partition_infos=partition_infos,
save_path='Figures/model_fits.pdf'
)
from IPython.display import IFrame
IFrame("Figures/model_fits.pdf", width=600, height=450)
In [4]:
soa_dict = subject_object_analysis(
project_df, plot=True, save_dir=SAVE_DIR, font_scale=1.5
)
In [5]:
# check that the figures were saved to disk
os.listdir(SAVE_DIR)
Out[5]:
In this calculation, we need the partition dates from all the models we fit above, stored in `partition_infos`. We calculate the daily average of the number of times a given violent word was used to activate the source domain. The average daily usage increases disproportionately when attack is the violent word, at least on Fox News. On the other networks, there is a drop in usage of the next-most-common violent words, hit and beat. These appear as tables in the paper; here we simply print them out in the notebook.
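For intuition, here is a minimal sketch of the kind of daily-average calculation described above; it is not the implementation inside `by_facet_word`, and the column names `facet_word` and `start_localtime` are assumptions for illustration.
In [ ]:
import pandas as pd

def mean_daily_usage(df, word, dates):
    # Hypothetical schema: 'facet_word' holds the violent word and
    # 'start_localtime' a datetime for each metaphor instance.
    sub = df[df['facet_word'] == word]
    counts = (sub.groupby(sub['start_localtime'].dt.floor('D'))
                 .size()
                 .reindex(dates, fill_value=0))
    return counts.mean()

# e.g. mean_daily_usage(project_df, 'attack', date_range)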
In [37]:
from viomet_9_10_17 import by_facet_word
excited, ground = by_facet_word(
project_df, partition_infos, facet_words=['attack', 'beat', 'hit']
)
from IPython.display import display
print('Excited:')
display(excited)
print('\nGround:')
display(ground)
print('\nExcited - Ground:')
display(excited - ground)
Two to-dos come together below. One is to generate more intuitive and powerful observables, which are outlined and calculated below; the other is to analyze the 2012 data. I'll do both at once, saving the plots for the end.
We should avoid terse variable names where possible for NHB. In one table we want the total ground-state usage; in another, the excitability quotient $Q_\alpha$, where $\alpha$ indicates the source-domain cross-section of interest. Specifically, we will calculate excitability quotients for cross-sections by the specific violent word in the metaphorical construction, so $\alpha \in \{\text{attack}, \text{hit}, \text{beat}\}$, the three most common words used for metaphorical violence.
We will also look at sums of cross-sections by who is the subject of the metaphorical violence (the one doing it) and who is its object (the victim). As individuals who could be subject or object, we consider the Republican and Democratic presidential candidates: Mitt Romney and Barack Obama in 2012, and Donald Trump and Hillary Clinton in 2016. We consider each candidate as subject or object paired with all other objects/subjects except their rival, and then each candidate as subject/object with their rival as object/subject. For 2016, then, $\alpha \in \{(\text{Trump}, \text{All}), (\text{Clinton}, \text{All}), (\text{Trump}, \text{Clinton}), (\text{Clinton}, \text{Trump}), (\text{All}, \text{Trump}), (\text{All}, \text{Clinton})\}$. We will calculate the total ground-state usage and the excitability quotient for each subject/object pair, for each cable news station.
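As a concrete sketch, the quotients for the facet-word cross-sections could be computed from the `excited` and `ground` tables returned by `by_facet_word` above. Treating $Q_\alpha$ as the simple elementwise ratio of excited- to ground-state usage is an assumption here for illustration, not a definition taken from the analysis code.
In [ ]:
from IPython.display import display

# Hedged sketch: excitability quotient per network and violent word,
# assuming Q_alpha = excited-state usage / ground-state usage.
Q = excited / ground
print('Excitability quotients Q_alpha:')
display(Q)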
In [10]:
IFrame('https://books.google.com/ngrams/graph?content=attack%2Chit%2Cbeat&year_start=2000&year_end=2016&corpus=17&smoothing=3&share=&direct_url=t1%3B%2Cattack%3B%2Cc0%3B.t1%3B%2Chit%3B%2Cc0%3B.t1%3B%2Cbeat%3B%2Cc0',
width=650, height=400)
Out[10]:
From the Google Ngram Viewer, the frequencies of attack, hit, and beat in its American English corpus in 2008 are .0067, .0062, and .0034, respectively. We can use these baselines to compare the frequencies of metaphor with attack, hit, and beat. We could also use the total instances identified through search in our corpus.
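A minimal sketch of that comparison: normalize each word's metaphor count by its Ngram baseline. The metaphor counts below are placeholders for illustration; in practice they would come from the `by_facet_word` tables or from the search totals in our corpus.
In [ ]:
# Google Ngram Viewer baselines (American English corpus, 2008), read
# from the plot above.
ngram_freq = {'attack': 0.0067, 'hit': 0.0062, 'beat': 0.0034}

# Placeholder metaphor counts, for illustration only.
metaphor_counts = {'attack': 120, 'hit': 60, 'beat': 40}

# Metaphorical usage per unit of baseline word frequency.
relative_rate = {w: metaphor_counts[w] / ngram_freq[w] for w in ngram_freq}
print(relative_rate)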
All this is well and good; now on to calculating these excitability quotients for 2012.
In [3]:
from projects.common import get_project_data_frame
metaphors_url = 'http://metacorps.io/static/data/viomet-2012-snapshot-project-df.csv'
project_df = get_project_data_frame(metaphors_url)
print(project_df.columns)
In [20]:
from viomet_9_10_17 import fit_all_networks
import pandas as pd
IATV_CORPUS_NAME = 'Viomet Sep-Nov 2012'
date_range = pd.date_range('2012-9-1', '2012-11-30', freq='D')
# run the model fits; this takes tens of seconds at least
fit_networks = fit_all_networks(project_df, date_range=date_range,
iatv_corpus_name=IATV_CORPUS_NAME)
In [23]:
from viomet_9_10_17 import by_network_frequency_figure
partition_infos = {network: fit_networks[network][0]
for network in ['MSNBCW', 'CNNW', 'FOXNEWSW']}
by_network_frequency_figure(
project_df, date_range=date_range,
iatv_corpus_name=IATV_CORPUS_NAME,
partition_infos=partition_infos,
save_path='Figures/model_fits_2012.pdf'
)
from IPython.display import IFrame
IFrame("Figures/model_fits_2012.pdf", width=600, height=450)
Out[23]:
In [28]:
soa_dict = subject_object_analysis(
project_df, subj_obj=[
('Romney', 'Obama'),
('Obama', 'Romney'),
('Romney', None),
('Obama', None),
(None, 'Romney'),
(None, 'Obama')
],
date_range=date_range,
plot=True, save_dir=SAVE_DIR, font_scale=1.5
)
In [29]:
from viomet_9_10_17 import by_facet_word
excited, ground = by_facet_word(
project_df, partition_infos, facet_words=['attack', 'beat', 'hit']
)
from IPython.display import display
print('Excited:')
display(excited)
print('\nGround:')
display(ground)
print('\nExcited - Ground:')
display(excited - ground)
OK, it looks like everything is working well. Time to review the annotated metaphors.